SecretP: identifying bacterial secreted proteins by fusing new features into Chou's pseudo-amino acid composition

J Theor Biol. 2010 Nov 7;267(1):1-6. doi: 10.1016/j.jtbi.2010.08.001. Epub 2010 Aug 5.

Abstract

Protein secretion plays an important role in bacterial lifestyles. Secreted proteins are crucial for bacterial pathogenesis because they mediate the interaction of bacteria with their environments, in particular the delivery of pathogenic and symbiotic bacteria into their eukaryotic hosts. Identifying bacterial secreted proteins is therefore an important step in the study of various diseases and the corresponding drugs. In this paper, several new features are fused into Chou's pseudo-amino acid composition (PseAAC), and two support vector machine (SVM)-based ternary classifiers are developed to predict secreted proteins of Gram-negative and Gram-positive bacteria. For the two types of bacteria, accuracies of 94.03% and 94.36% are obtained in distinguishing classically secreted, non-classically secreted and non-secreted proteins. To compare the practical ability of our method in identifying bacterial secreted proteins with that of six published methods, proteins from Escherichia coli and Bacillus subtilis are collected to construct test sets for Gram-negative and Gram-positive bacteria; on these sets, the prediction results of our method are comparable to those of existing methods. When applied to two public independent data sets for predicting non-classically secreted proteins (NCSPs), the method also yields satisfactory results for Gram-negative bacterial proteins. The prediction server SecretP can be accessed at http://cic.scu.edu.cn/bioinformatics/secretPV2/index.htm.
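
To make the PseAAC representation mentioned above concrete, the following is a minimal sketch of the basic (type 1) pseudo-amino acid composition for a single physicochemical property. It is not the feature set used by SecretP: the abstract does not list the additional fused features, so none are included here. The Kyte-Doolittle hydropathy scale is used as a stand-in property, and the function name `pseaac`, the number of correlation tiers `lam` and the weight `w` are illustrative assumptions.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here only as a stand-in property
# (an assumption; the abstract does not say which properties were used).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Standardize the property over the 20 amino acids (zero mean, unit variance).
_mu, _sd = mean(list(KD.values())), pstdev(list(KD.values()))
H = {aa: (v - _mu) / _sd for aa, v in KD.items()}


def pseaac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional type 1 PseAAC vector for one property.

    lam: number of sequence-order correlation tiers (illustrative default).
    w:   weight of the sequence-order terms (illustrative default).
    """
    # Keep only the 20 standard residues; other symbols are skipped.
    residues = [aa for aa in seq.upper() if aa in H]
    n = len(residues)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")

    # Conventional amino acid composition: 20 occurrence frequencies.
    freqs = [residues.count(aa) / n for aa in AMINO_ACIDS]

    # k-th tier correlation factor: mean squared property difference
    # between residues that are k positions apart.
    thetas = [
        sum((H[residues[i]] - H[residues[i + k]]) ** 2 for i in range(n - k))
        / (n - k)
        for k in range(1, lam + 1)
    ]

    # Joint normalization so that all 20 + lam components sum to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    # Arbitrary example sequence, not taken from the paper's data sets.
    vector = pseaac("MKKLLPTAAAGLLLLAAQPAMA", lam=3)
    print(len(vector))
```

A vector of this kind, extended with whatever further features are chosen, would then be passed to a three-class classifier such as an SVM to separate classically secreted, non-classically secreted and non-secreted proteins.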

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids / analysis*
  • Bacillus subtilis / metabolism
  • Bacterial Proteins / analysis*
  • Bacterial Proteins / metabolism
  • Computational Biology / methods*
  • Escherichia coli / metabolism
  • Gram-Negative Bacteria / metabolism
  • Gram-Positive Bacteria / metabolism
  • Internet
  • Neural Networks, Computer*

Substances

  • Amino Acids
  • Bacterial Proteins